The traveling salesman problem, conformal invariance, and dense polymers
We propose that the statistics of the optimal tour in the planar random Euclidean traveling salesman 
problem is conformally invariant on large scales. This is exhibited in power-law behavior of the 
probabilities for the tour to zigzag repeatedly between two regions, and in subleading corrections to 
the length of the tour. The universality class should be the same as for dense polymers and minimal 
spanning trees. The conjectures for the length of the tour on a cylinder are tested numerically. 



The traveling salesman problem (TSP) is a classic problem in combinatorial optimization. The basic problem is, given a set of $N$ marked points ("cities") in the plane, to find the closed path (a cycle or "tour") of shortest length that passes through each city once. In the random uniform TSP, the cities are chosen randomly, and are independently and uniformly distributed over some bounded domain $\mathcal{D}$, say a square, with mean density $n$. While much effort has been expended on finding algorithms that produce the optimal tour for a given set of cities, for statistical physicists the interest of the problem is in the statistical properties of the optimal tour in the random uniform problem, to which we will refer simply as TSP [1,2]. In the past, much attention has been given to the total length $\ell$ of the optimal tour, which for $N$ cities in a square behaves as $\ell(N) \sim \beta A/a$ with probability one as $N \to \infty$ [3,4], where $A$ is the area of the square, $a = \sqrt{A/N} = 1/\sqrt{n}$ is the typical spacing of the cities, and $\beta$ is a constant: $\beta \simeq 0.7120$ [5]. Finite-$N$ corrections to the mean length in a cube of dimension $d$ with periodic boundary conditions have been studied [5].
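As a numerical illustration of the leading scaling form quoted above, the following minimal sketch evaluates $a = \sqrt{A/N}$ and $\ell \approx \beta A/a = \beta\sqrt{NA}$. The function names are ours and purely illustrative; the value of $\beta$ is the one quoted from Ref. [5], and finite-$N$ and boundary corrections are ignored.

```python
import math

BETA = 0.7120  # constant quoted in the text from Ref. [5]


def typical_spacing(area: float, n_cities: int) -> float:
    """Typical spacing of the cities, a = sqrt(A / N)."""
    return math.sqrt(area / n_cities)


def leading_tour_length(area: float, n_cities: int, beta: float = BETA) -> float:
    """Leading large-N estimate l ~ beta * A / a = beta * sqrt(N * A)."""
    return beta * area / typical_spacing(area, n_cities)


# Example: N = 10^4 cities in a unit square.
print(leading_tour_length(1.0, 10_000))
```

The estimate grows as $\sqrt{N}$ at fixed area, so quadrupling the number of cities doubles the leading-order length.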

In this paper, we consider geometrical properties of the optimal tour, other than the mean length for the square. These properties include the dependence of the mean length on the aspect ratio when the cities are distributed in a rectangle or on the surface of a cylinder. They also include the connectivity of the path, which we quantify by defining the number of times the tour alternately enters (or "zigzags" between) two specified subregions. We formulate conjectures based on statistical conformal invariance [6] of the properties of the optimal tour over length scales much larger than $a$. For clarity we separate these conjectures and call them TSPI, II, and III. The conjectures are: I) conformal invariance of the distribution of tours, and hence power-law behavior of the probability $P_k(r)$ for zigzagging $k$ times between two regions that are a distance $r$ apart (and much further from the boundary of $\mathcal{D}$),



$$P_k(r) \propto r^{-2x_k} \qquad (1)$$



for $r \gg a$, for some exponents $x_k$. This implies that the optimal tour is a random fractal, and the $x_k$ determine the fractal dimensions $D_k = 2 - x_k$ of various sets associated with the tour. These predictions, including the values of $x_k$, are universal; they do not depend on the precise distribution of cities, which might even be correlated, as long as the distribution is translationally and rotationally invariant, and any correlations are short-range; II) the mean length $\ell$ of the optimal tour in a domain of area $A$ that has a smooth boundary of length $C$ has the form $\ell/a = \beta n A + \gamma C/a + \ldots$ (a weaker version of this has been proved for the square [7], and implies that $\gamma > 0$).
If we define $\Delta\ell/a = \ell/a - \beta n A - \gamma C/a$, then we expect

$$\Delta\ell/a \sim -\frac{\lambda c \chi}{6} \ln(|\mathcal{D}|/a) \qquad (2)$$

as $N \to \infty$, where $\lambda > 0$ and $c$ are constants, $|\mathcal{D}|$ is the diameter of $\mathcal{D}$, and $\chi$ is the Euler number of $\mathcal{D}$ ($\chi = 2 - 2h - b$, where $h$ is the number of handles, $b$ is the number of boundaries). If the tour is also constrained to zigzag $k$ times between two fixed regions far from the boundary, then we expect in addition



$$\Delta\ell/a \sim 2\lambda x_k \ln(r/a) \qquad (3)$$



for $r/a \to \infty$ [here $x_k$ are the same as in eq. (1)]. The values of $\beta$, $\gamma$, and $\lambda$ are not universal; III) the universality class for the TSP is the same as that known as dense polymers, so the exact values are [8-11]



$$x_k = (k^2 - 4)/16, \qquad c = -2. \qquad (4)$$



After explaining these conjectures further, we study the 
length as in TSPII numerically by a transfer-matrix-like 
approach on a cylinder, testing the conformal symmetry 
proposed in I, and finding reasonable agreement with the 
quantitative conjectures in II and III. 

There is a possible relation with the minimal spanning tree (MST) problem (given $N$ cities, find the tree of smallest length with those cities as vertices; the cities are chosen at random as in the TSP). Analogs of parts of our TSPI and III have already been discussed for MSTs [12,13]. Here the analog of TSPIII would involve so-called uniform spanning trees (USTs) in place of dense polymers. The equivalence of USTs and MSTs in $d = 2$ is not excluded by rigorous results [12], and in our view is supported by existing numerics [13]. A tree in two dimensions is equivalent to a nonintersecting loop (take the boundary of a "thickened" tree), and the universality classes of USTs and dense polymers are the same in $d = 2$ [14] (and also the same as stochastic Loewner evolution at parameter value $\kappa = 8$ [15]). Hence in two dimensions we expect the universality classes for TSP and MSTs to be the same (the Hausdorff dimension studied in Ref. [13], which is equal to 5/4 for USTs [16], corresponds to our $D_4$).

We now further explain our predictions. First, we note that the optimal tour cannot cross itself, as that would allow a shorter tour to be found. For a typical set of cities, the tour comes close to every point in the region considered. In the scaling limit $a \to 0$ for a fixed $A$, the path becomes a random space-filling (Peano) curve with Hausdorff dimension $D = D_2 = 2$ [3]. For a tour within a simply-connected domain, the interior of the curve is well defined, and can be shaded black, leaving the exterior white. The interior and exterior form interlocking trees which appear statistically alike, except near the boundary of the domain. This implies a self-duality to the tour. One strongly suspects that such a random self-dual curve must be scale, and very likely also conformally, invariant, in a sense we must now explain.

Consider a square at arbitrary position well within the interior, with side $L$ much less than the size of the domain, and with edges parallel to the coordinate axes. The tour passes through this square some number of times, entering and leaving on two sides (not necessarily distinct) of the square. We examine the number $n_x \geq 0$ of times the tour crosses the square in the $x$-direction, that is the number of segments of the tour that have one end on each of the two edges parallel to the $y$-axis. Similarly, we can consider the number of times $n_y$ it crosses the square in the $y$-direction. If $n_x > 0$, then $n_y = 0$, and vice versa. We expect that the joint probability distribution for $n_x$, $n_y$ is concentrated at small values of $n_x$, $n_y$. Then the expectation $\langle n_x \rangle$ will be of order 1. As $L/a$ increases, $\langle n_x \rangle$ will remain nonzero, as the tour must occasionally cross the square. (The possibility that the probability of $n_x = n_y = 0$ approaches 1 appears to be excluded, because of the requirement that the tour be a single cycle.) Thus we expect that the joint probability distribution for $n_x$ and $n_y$ is scale invariant for large $L/a$.

Consider two disks $A$, $B$ of radius $r_0$, the centers of which are separated by $r > 2r_0$. Let $P_k(r)$ be the probability for crossing (zigzagging) between the two disks precisely $k$ times ($k$ even); more precisely, it is the probability that $k$ distinct connected segments of the tour lying outside both disks have one end on the boundary of each disk. By standard arguments, scale invariance of the crossing probabilities (applied to annuli concentric with $A$ or $B$) leads us to expect eq. (1) to hold as a function of $r$, for $r \gg a$ and $a$, $A$, $B$ fixed (and with $r$ much less than the distance of $A$ or $B$ from the boundary of $\mathcal{D}$), and that $x_k > 0$ for $k > 2$. With $D_2 = 2$, it follows that $x_2 = 0$.

We may define similar crossing or "$k$-leg" probabilities for $k$ odd by allowing the path to end at any two of the marked cities, such that the total length is minimized. For open paths, we define $P_k(r)$ for $k$ odd as the probability that the path starts in $A$ and ends in $B$, crossing between them $k$ times. We expect $x_1 < 0$, meaning that the optimal path will usually have its ends far apart. These definitions of the 2-point correlations can be easily extended to general $n$-point functions which are probabilities for the path to pass between $n$ disks in some specified sequence. As for $n = 2$, we expect these to possess scaling limits as $a \to 0$, and these define a probability distribution on non-self-intersecting self-dual space-filling curves, which will be universal, and which we wish to characterize. The definition of the $k$-leg events can be generalized to the case when a disk is close to a boundary of the domain, which is assumed here to be straight. For $n = 2$, if the distances of $A$ and $B$ to the boundary are both much less than their separation $r$, then we expect $P_k(r) \propto (r/a)^{-2\tilde{x}_k}$, with universal exponents $\tilde{x}_k$ different from $x_k$ (again, $\tilde{x}_1 < 0$ and $\tilde{x}_2 = 0$).

Conformal invariance is expected in scale-invariant statistical problems when they are defined by local processes. In the TSP, the length which is to be minimized is local in the sense that it is the sum of small local steps. There may be a concern that the global constraint of visiting each city once violates locality. However, such a condition is also present in dense polymers, so is not necessarily an issue.

For the following arguments, and for numerical purposes, it is convenient to consider the TSP on a cylinder, with circumference $W$ (i.e. in the region $0 < x < W$ in the plane with a periodic boundary condition in the $x$-direction), and length $L$, so $A = LW$. The cities are uniformly distributed over this region. In this case, scaling arguments suggest that the probability that the path crosses at least $k$ times between two regions within $r_0$ of the ends behaves for $N \to \infty$ as $P_k(W, L) \propto e^{-2\pi x_k L/W}$ for $L/W \gg 1$. If conformal invariance holds, then the exponents $x_k$ here are the same as those defined above in the plane, by using a conformal mapping of the plane to the cylinder [17].
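The mapping behind this statement can be sketched in a few lines. What follows is the standard logarithmic map, added here for illustration rather than taken from the text. The plane coordinate $z$, the annulus radii $R_1$, $R_2$, and the disk radius $\rho$ are our notation, and identifying $\rho$ with the cutoff $a$ holds only up to a factor of order one. The cylinder coordinate $x$ should not be confused with the exponent $x_k$.

```latex
% Assumed coordinates: z in the plane; (x, t) on the cylinder, with x periodic
% of period W and t running along the axis.
t = \frac{W}{2\pi}\,\ln|z| , \qquad x = \frac{W}{2\pi}\,\arg z .
% An annulus R_1 < |z| < R_2 is mapped to a cylinder of length
L = \frac{W}{2\pi}\,\ln\frac{R_2}{R_1}
\quad\Longrightarrow\quad
e^{-2\pi x_k L/W} = \left(\frac{R_2}{R_1}\right)^{-x_k} .
% The exterior of two disks of radius \rho at separation r \gg \rho is
% conformally equivalent to an annulus with R_2/R_1 \simeq (r/\rho)^2, hence
P_k \propto \left(\frac{r}{\rho}\right)^{-2x_k} , \qquad \rho \sim a .
```

Under these assumptions the exponential decay on the cylinder and the power law of eq. (1) in the plane carry the same exponent $x_k$.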

Now suppose that, instead of the tour being unconstrained, the tour is required to cross at least $k$ times between the end regions, no matter what the positions of the cities. As the tour must minimize its length, for $k > 2$ the mean length $\ell_k$ of the constrained tour will be greater than or equal to that of the unconstrained ($k = 2$) one, by an amount $\propto L$. In fact, from scale invariance we expect that, for each $k$, as $N \to \infty$ with $L/W$ fixed, $\Delta\ell_k/a = \ell_k/a - \beta n A - 2\gamma W/a$ is proportional to a universal function of $L/W$, and this function is $\propto L/W$ as $L/W \to \infty$. We further expect that, for $k$ even, the change in length of the constrained tour $(\ell_k - \ell_2)/a$ is proportional to the logarithm of the probability for $k$ crossings in the unconstrained case: $(\ell_k - \ell_2)/a \sim -\lambda \ln P_k(W, L)$ as $a \to 0$. Thus for large $L/W$, we have

$$(\ell_k - \ell_2)/a \sim 2\pi\lambda x_k L/W, \qquad (5)$$

where $\lambda > 0$ is a non-universal constant, but is the same for all $k$, and we expect this form to hold for $k$ odd also [conformally mapping this to the plane yields eq. (3)]. We expect similar behavior also for the unconstrained tour, and so we define a universal constant $c$ by

$$\Delta\ell_2/a \sim -\frac{\pi\lambda c}{6}\, L/W. \qquad (6)$$

One expects then $\lambda > 0$ and $c < 0$.

The appearance here of only a single non-universal constant $\lambda$, and the various scaling forms suggested, require further explanation. We are using an analogy with conformal field theories (CFTs) in two-dimensional critical phenomena. There, the variation of the free energy (the logarithm of the partition function) with respect to the metric of the space defines the stress tensor of the theory. This leads to universal scaling forms for the subtracted free energy in various geometries or with constraints or sources imposed, as in a correlation function [17-19]. Our central conjecture is that, up to a non-universal factor $\lambda$, the length behaves as the free energy in some CFT ($c$ is then the central charge). The reason is that the length $\ell/a$ is the integral over space of a local density, and in many CFTs (including the dense polymer theory considered below) the stress tensor is (up to a factor) the only local spin-2 operator of the correct symmetry and lowest scaling dimension that can appear as the variation of the length with metric. As free energy differences also determine probabilities for events, the relation of the length differences with the probability also follows. By such arguments in the plane, or in other smooth domains (including non-planar ones, such as the sphere or torus), one obtains the scaling form (2) [19]. For geometries in which $\chi$ is zero, the term in eq. (2) is replaced by $\lambda$ times a universal scale-invariant functional of the geometry [18], as for the cylinder in eq. (6).

There is a probability distribution for non-self-intersecting self-dual space-filling curves that arises from statistical mechanics. This is the Nienhuis dense-polymer phase that originally arose from the low-temperature phase of the O($m$) loop model at $m \to 0$ [8]. It is a model of closed loops on the honeycomb graph, on which each edge of the graph is occupied at most once. The partition function of the model is $Z = \sum K^E m^C$, where the sum is over all such loop configurations, $K$ plays the role of inverse temperature, $E$ is the number of edges occupied, and $C$ is the number of distinct loops. When $m = 0$, $Z = 0$, but the partition function for a single closed loop on the honeycomb graph can be obtained by differentiating, $Z' = dZ/dm|_{m=0}$, and then probabilities $P_k(r)$ can be defined analogously to those above. The model has a critical point at $K = K_c = (2 + \sqrt{2})^{-1/2}$, and the region $K_c < K < \infty$ is a conformally-invariant low-temperature phase in which the scaling exponents are independent of $K$. The scaling limit of the probability distribution is highly robust: no local perturbations are relevant, except for that of allowing the loops to cross [20]. Therefore we find it natural to propose TSPIII, that the TSP and dense-polymer universality classes are the same. For dense polymers, $c = -2$, and the $k$-leg exponents are given by eq. (4) for the bulk, and by [10]

$$\tilde{x}_k = k(k - 2)/8 \qquad (7)$$

for the boundary.
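The exponent formulas are simple enough to tabulate exactly. The sketch below evaluates the bulk exponents of eq. (4), the boundary exponents of eq. (7), and the dimensions $D_k = 2 - x_k$ as exact fractions; the function names are ours and illustrative only.

```python
from fractions import Fraction


def bulk_exponent(k: int) -> Fraction:
    """Bulk k-leg exponent x_k = (k^2 - 4)/16, eq. (4)."""
    return Fraction(k * k - 4, 16)


def boundary_exponent(k: int) -> Fraction:
    """Boundary k-leg exponent k(k - 2)/8, eq. (7)."""
    return Fraction(k * (k - 2), 8)


def fractal_dimension(k: int) -> Fraction:
    """D_k = 2 - x_k for the set associated with k-leg events."""
    return 2 - bulk_exponent(k)


# Columns: k, bulk exponent, boundary exponent, D_k.
for k in range(1, 5):
    print(k, bulk_exponent(k), boundary_exponent(k), fractal_dimension(k))
```

For $k = 1$ and $k = 3$ this gives $x_1 = -3/16$ and $x_3 = 5/16$, while $k = 2$ gives $x_2 = 0$ and $D_2 = 2$, as required for a space-filling curve.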

We have studied the length of a tour on a cylinder by extensive numerical calculations. The approach is similar to transfer matrix methods in statistical mechanics. The tour on a cylinder is "grown" starting from one end of the cylinder. The distance $y$ along the cylinder plays the role of time and will be denoted $t$ from here on. We set $a = 1$. The transfer process starts with a city $p_1$ at $x_1 = 0$, $t_1 = 0$ from which $k$ lines emerge (we say it has valence $k$), and terminates with a similar city $p_N$ at some $x_N$ and $t_N$. The remaining $N - 2$ cities $p_i$ have coordinates $x_i$, $t_i$, and valence 2. The variables $x_i$ and differences $t_i - t_{i-1}$ are all independent for $i > 1$. The $x_i$'s are chosen uniformly in $[0, W]$, while the differences $t_i - t_{i-1} > 0$ are exponentially distributed with mean $1/W$ (this reproduces the Poisson process with density 1).
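The sampling rule just described can be sketched as follows. This is a minimal illustration, not the code used for the results reported here: the function names, the seed argument, and the helper for the periodic distance are our own assumptions.

```python
import math
import random
from typing import List, Optional, Tuple

City = Tuple[float, float]  # (x, t): position around and along the cylinder


def sample_cities(width: float, n_cities: int, seed: Optional[int] = None) -> List[City]:
    """Cities ordered in t: x uniform in [0, W], gaps in t exponential with mean 1/W."""
    rng = random.Random(seed)
    cities: List[City] = [(0.0, 0.0)]  # p_1 sits at x_1 = 0, t_1 = 0
    t = 0.0
    for _ in range(n_cities - 1):
        t += rng.expovariate(width)  # rate W, hence mean gap 1/W
        cities.append((rng.uniform(0.0, width), t))
    return cities


def cylinder_distance(p: City, q: City, width: float) -> float:
    """Euclidean distance with x periodic of period W (helper assumed here)."""
    dx = abs(p[0] - q[0]) % width
    dx = min(dx, width - dx)
    return math.hypot(dx, p[1] - q[1])


cities = sample_cities(width=4.0, n_cities=1000, seed=0)
print(len(cities), cities[-1][1])  # the final t should be close to N / W
```

Generating the cities already sorted in $t$ is convenient because the transfer process visits them in that order.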

For a given set of cities, and for each time $t_i$, the information needed to complete a tour and find the optimal one consistent with the constraint is encoded in a set of states, each of which has a weight (length of path) associated with it. The transfer matrix will evolve these states and weights to $t_{i+1}$. A state consists of a list of the $M$ cities $p_{i_\alpha}$, $\alpha = 1, \ldots, M$ ($i_1 < i_2 < \cdots < i_M \leq i$), whose valence has not yet been satisfied (by connection to other cities), plus connections among these cities. Of the $M$ cities, there may be some that are not connected to any other one, some that are connected to just one other, and there is always one distinguished set of at most $k$ cities that are connected to each other. To each of these states, we can associate the set of paths (with minimum total length and no closed loops) that form the connections and satisfy the valence of all the cities $p_j$, $j \leq i$, not in the list; the distinguished set of $k$ cities are connected to the initial city $p_1$ (which itself will be in the list during the early stages). The length of this set of paths is the weight associated with the state.

TABLE I. Length $\ell_k$ of tour per unit area, for each width $W$, extrapolated to $i_{\max} \to \infty$, with values of $\lambda c$, $\lambda x_k$ deduced from $\ell_k$ for pairs $W$, $W - 1$. Data for $W = 1$ are not shown.

Columns: $W = 2$, $W = 3$, $W = 4$, $W = 5$, $W = 6$

$\ell_2/(LW)$:  1.07497(2)  0.8330(1)  0.7571(1)  0.7330(2)  0.720(2)
$\ell_1/(LW)$:  0.7457(4)  0.7175(3)  0.715(1)  0.713(2)
$\ell_3/(LW)$:  1.5286(1)  1.075(1)  0.872(1)  0.80(1)  0.76(1)
$\lambda c$:  -2.39  -3.33  -2.98(1)  -2.04(2)  -2.03(15)
$\lambda x_1$:  -0.15(1)  -0.20(2)  -0.15(2)  -0.10(5)
$\lambda x_3$:  0.30(2)  0.33(2)  0.29(1)  0.26(2)

When a city $p_{i+1}$ of valence 2 is added, the states and their weights at $t_{i+1}$ are related to those at $t_i$ by one of three moves: (1) $p_{i+1}$ may be unconnected to any other city; (2) $p_{i+1}$ may be directly connected to one other city $p_{i_\alpha}$ in the list; (3) $p_{i+1}$ may be directly connected to two other cities $p_{i_\alpha}$, $p_{i_\beta}$, so it will not appear in the list (if $p_{i_\alpha}$, $p_{i_\beta}$ were connected to each other at time $t_i$, this move is forbidden). In moves (2) and (3), cities $p_{i_\alpha}$ or $p_{i_\beta}$ whose valence becomes satisfied disappear from the list, and the connections of other cities change accordingly. The weight of a state at time $t_{i+1}$ is equal to that of the state it came from at time $t_i$ plus the length of the connections that have been added. If a state at time $t_{i+1}$ can arise from more than one state at time $t_i$, its weight is taken as the minimum of the various possibilities. At the final time $t_N$, a $k$-valent city is added (using similar moves), and we take the length $\ell_k$ to be the length of the state at $t_N$ in which all valences are satisfied (the condition of not connecting already-connected cities is dropped at this step).
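The weight rule above (previous weight plus added length, minimized over all predecessor states) is the same min-plus recursion that underlies exact dynamic programming for the TSP. As a reference for checking an implementation on very small instances, the sketch below uses the Held-Karp algorithm on a cylinder. It is explicitly not the transfer-matrix procedure of the text: its states are subsets of visited cities, its cost grows as $2^N$, and all names are our own assumptions.

```python
import itertools
import math
from typing import Dict, List, Tuple

City = Tuple[float, float]  # (x, t), with x periodic of period W


def cylinder_distance(p: City, q: City, width: float) -> float:
    """Euclidean distance with a periodic boundary condition in x."""
    dx = abs(p[0] - q[0]) % width
    dx = min(dx, width - dx)
    return math.hypot(dx, p[1] - q[1])


def exact_tour_length(cities: List[City], width: float) -> float:
    """Shortest closed tour by Held-Karp dynamic programming (tiny N only)."""
    n = len(cities)
    if n < 2:
        return 0.0
    d = [[cylinder_distance(p, q, width) for q in cities] for p in cities]
    # best[(mask, j)]: shortest path from city 0 through the cities in mask
    # (a bitmask over cities 1..n-1), ending at city j.
    best: Dict[Tuple[int, int], float] = {(1 << j, j): d[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in itertools.combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask ^ (1 << j)
                # Weight = minimum over predecessors of (old weight + new step).
                best[(mask, j)] = min(
                    best[(prev, i)] + d[i][j] for i in subset if i != j
                )
    full = sum(1 << j for j in range(1, n))
    return min(best[(full, j)] + d[j][0] for j in range(1, n))


unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(exact_tour_length(unit_square, width=100.0))
```

The comparison is meaningful only for a handful of cities, but it gives an independent exact answer for the closed ($k = 2$) tour.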

Clearly, with the above moves, the size of the state space grows without limit as $i \to \infty$. In the spirit of heuristic (local optimization) algorithms we deal with this by discarding all states at each time $t_i$ that contain a city $p_{i_\alpha}$ with $i_\alpha < i - i_{\max}$. The transfer matrix is then finite dimensional, with a size that grows exponentially with $i_{\max}$. This truncation means that steps with a long $t$-component are suppressed. For $i_{\max} \gg W$, they would be rare on the optimal path anyway.
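The truncation rule can be written in a few lines. The sketch below assumes a simplified, hypothetical encoding in which a state is identified only by the sorted tuple of indices of its open cities; the connection data that a full state also carries are omitted.

```python
from typing import Dict, Tuple

# Hypothetical encoding: sorted indices of the cities whose valence is not yet
# satisfied. A real state would also record the connections among them.
State = Tuple[int, ...]


def truncate_states(states: Dict[State, float], i: int, i_max: int) -> Dict[State, float]:
    """Discard every state that contains a city index i_alpha < i - i_max."""
    cutoff = i - i_max
    # The tuple is sorted, so its first entry is the oldest open city.
    return {s: w for s, w in states.items() if not s or s[0] >= cutoff}


states = {(3, 7): 1.0, (6, 7): 2.0, (4,): 3.0}
print(truncate_states(states, i=8, i_max=4))
```

With $i = 8$ and $i_{\max} = 4$ the state containing city 3 is dropped, while states whose oldest open city has index 4 or larger survive.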

We have produced results for $\ell_k(i_{\max})$ with $1 \leq W \leq 6$ and $4 \leq i_{\max} \leq 8$ or 9. In each case, we find the optimal tour for $N = 10^5 \cdot W$ cities, and then average over 10 independent samples. The systematic error due to the finite value of $N$ is negligible as compared to the statistical error (sample-to-sample fluctuations). To extrapolate to the $i_{\max} \to \infty$ limit we use the Ansatz $\ell_k(i_{\max})/(LW) = \ell_k(\infty)/(LW) + A e^{-B i_{\max}}$, $A$ and $B$ being $k$-dependent constants, which matches the data very well, especially when $i_{\max} > W$. In Table I we display the extrapolated data for $k = 2, 1, 3$. Estimates for $\lambda c$ ($\lambda x_k$, $k = 1, 3$) were based on eq. (6) [eq. (5)], using $\ell_2$ ($\ell_k - \ell_2$) for pairs $W$, $W - 1$ ($\gamma$ was neglected). We obtained a value $\beta = 0.7119(3)$, in good agreement with Ref. [5]. Our final best estimates are $\lambda c = -2.0(2)$, $\lambda x_1 = -0.15(5)$, and $\lambda x_3 = 0.28(4)$. We note that the values $\lambda = 1$, $c = -2$, $x_1 = -3/16$, and $x_3 = 5/16$ lie within the error bars. At present, we have no explanation for why $\lambda$ should be close to one.
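Two steps of this analysis can be sketched numerically. The first function extrapolates three equally spaced values of $i_{\max}$ under the exponential Ansatz; this closed form (Aitken's delta-squared formula) is our assumption for illustration, since the fitting procedure is not specified above. The second solves eq. (6) with $a = 1$ for $\lambda c$ from a pair of widths, using $\ell_2/(LW) = \beta - (\pi\lambda c/6)/W^2$ so that $\beta$ cancels in the difference. The $\ell_2$ values are copied from Table I; the function names are ours.

```python
import math

# Extrapolated l_2/(LW) from Table I, keyed by the width W (with a = 1).
L2_PER_AREA = {2: 1.07497, 3: 0.8330, 4: 0.7571, 5: 0.7330, 6: 0.720}


def extrapolate_exponential(y0: float, y1: float, y2: float) -> float:
    """Limit of y(i) = y_inf + A exp(-B i) from three equally spaced samples."""
    d1, d2 = y1 - y0, y2 - y1
    return y2 - d2 * d2 / (d2 - d1)


def lambda_c_from_pair(l2_wide: float, l2_narrow: float, width: int) -> float:
    """Solve eq. (6) for lambda*c using widths W and W - 1 (gamma neglected)."""
    inv_sq_diff = 1.0 / width**2 - 1.0 / (width - 1) ** 2
    return -6.0 / math.pi * (l2_wide - l2_narrow) / inv_sq_diff


for w in range(3, 7):
    estimate = lambda_c_from_pair(L2_PER_AREA[w], L2_PER_AREA[w - 1], w)
    print(w, round(estimate, 2))
```

Applied to the tabulated $\ell_2$ values, the pair estimates agree with the $\lambda c$ entries of Table I for $W = 3$ to $6$ to within rounding; the $W = 2$ entry needs the $W = 1$ data, which are not shown.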

Finally, for TSP in a three-dimensional region of fixed thickness $L_3$ in the $z$-direction and a domain $\mathcal{D}$ of area $A \gg L_3^2$ in the $x$-$y$ plane, projecting the tour into the plane produces a two-dimensional problem, but the projected tour will occasionally cross itself. If our conjectures hold for $L_3 = 0$, then by Ref. [20] any small, positive $L_3$ causes a crossover to a "Goldstone phase". It then seems very likely that TSP in all $d > 2$ is also described by the Goldstone phases, which would mean that the segments of the tour in a box of side $L$ have Hausdorff dimension 2, and behave as Brownian walks on large scales. Conformal invariance would be lost in these cases.